Highly Parallel Sparse Cholesky Factorization
Authors
Abstract
We develop and compare several fine-grained parallel algorithms to compute the Cholesky factorisation of a sparse matrix. Our experimental implementations are on the Connection Machine, a distributed-memory SIMD machine whose programming model conceptually supplies one processor per data element. In contrast to special-purpose algorithms in which the matrix structure conforms to the connection structure of the machine, our focus is on matrices with arbitrary sparsity structure. Our most promising algorithm is one whose inner loop performs several dense factorisations simultaneously on a two-dimensional grid of processors. Virtually any massively parallel dense factorisation algorithm can be used as the key subroutine. The sparse code attains execution rates comparable to those of the dense subroutine. Although at present architectural limitations prevent the dense factorisation from realising its potential efficiency, we conclude that a regular data parallel architecture can be used efficiently to solve arbitrarily structured sparse problems. We also present a performance model and use it to analyse our algorithms. We find that asymptotic analysis combined with experimental measurement of parameters is accurate enough to be useful in choosing among alternative algorithms for a complicated problem.
*Xerox Palo Alto Research Center, 3333 Coyote Hill Road, Palo Alto, California 94304. Copyright © 1990 Xerox Corporation. All rights reserved.
†Research Institute for Advanced Computer Science, MS 230-5, NASA Ames Research Center, Moffett Field, CA 94035. This author's work was supported by the NAS Systems Division and DARPA via Cooperative Agreement NCC 2-387 between NASA and the University Space Research Association (USRA).
Similar resources
Scalable Parallel Algorithms for Solving Sparse Systems of Linear Equations
We have developed a highly parallel sparse Cholesky factorization algorithm that substantially improves the state of the art in parallel direct solution of sparse linear systems—both in terms of scalability and overall performance. It is a well known fact that dense matrix factorization scales well and can be implemented efficiently on parallel computers. However, it had been a challenge to dev...
Supernodal Symbolic Cholesky Factorization on a Local-Memory Multiprocessor
In this paper, we consider the symbolic factorization step in computing the Cholesky factorization of a sparse symmetric positive definite matrix on distributed-memory multiprocessor systems. By exploiting the supernodal structure in the Cholesky factor, the performance of a previous parallel symbolic factorization algorithm is improved. Empirical tests demonstrate that there can be drastic redu...
Highly Scalable Parallel Algorithms for Sparse Matrix Factorization
In this paper, we describe scalable parallel algorithms for sparse matrix factorization, analyze their performance and scalability, and present experimental results for up to 1024 processors on a Cray T3D parallel computer. Through our analysis and experimental results, we demonstrate that our algorithms substantially improve the state of the art in parallel direct solution of sparse linear sys...
A Highly Scalable Parallel Algorithm for Sparse Matrix Factorization
In this paper, we describe a scalable parallel algorithm for sparse matrix factorization, analyze its performance and scalability, and present experimental results for up to 1024 processors on a Cray T3D parallel computer. Through our analysis and experimental results, we demonstrate that our algorithm substantially improves the state of the art in parallel direct solution of sparse linear sy...
On Evaluating Parallel Sparse Cholesky Factorizations
Though many parallel implementations of sparse Cholesky factorization, accompanied by experimental results, have been proposed, it seems hard to evaluate the performance of these factorization methods theoretically because of the irregular structure of sparse matrices. This paper is an attempt at such an evaluation. On the basis of the criteria of parallel computation and communication time, we ...
A PERFORMANCE STUDY OF SPARSE CHOLESKY FACTORIZATION ON INTEL iPSC/860
The problem of Cholesky factorization of a sparse matrix has been very well investigated on sequential machines. A number of efficient codes exist for factorizing large unstructured sparse matrices, for example, codes from Harwell Subroutine Library [4] and Sparspak [7]. However, there is a lack of such efficient codes on parallel machines in general, and distributed memory machines in particul...
Journal title:
- SIAM J. Scientific Computing
Volume 13, Issue
Pages -
Publication date: 1992